Streaming-Data Algorithms for High-Quality Clustering

Authors

  • Liadan O'Callaghan
  • Adam Meyerson
  • Rajeev Motwani
  • Nina Mishra
  • Sudipto Guha

Abstract

Streaming data analysis has recently attracted attention in numerous applications including telephone records, web documents and clickstreams. For such analysis, single-pass algorithms that consume a small amount of memory are critical. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.

Similar resources

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data and the need to process it and extract useful information and meaningful patterns have drawn many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...

Hybrid algorithms for Job shop Scheduling Problem with Lot streaming and A Parallel Assembly Stage

In this paper, a job shop scheduling problem with a parallel assembly stage and Lot Streaming (LS) is considered for the first time in both the machining and assembly stages. The Lot Streaming technique is a process of splitting jobs into smaller sub-jobs such that successive operations can be overlapped. Hence, to solve the job shop scheduling problem with a parallel assembly stage and lot streaming, deci...

Intrusion Detection System with Data Stream Clustering Approach

Fast and high-quality intrusion detection algorithms play an important role in providing security management component by organizing large amounts of information into a small number of meaningful clusters. In particular, clustering algorithms that build meaningful groups of data via network log file are ideal tools for their interactive visualization and exploration as they provide a powerful m...

XStreamCluster: An Efficient Algorithm for Streaming XML Data Clustering

XML clustering finds many applications, ranging from storage to query processing. However, existing clustering algorithms focus on static XML collections, whereas modern information systems frequently deal with streaming XML data that needs to be processed online. Streaming XML clustering is a challenging task because of the high computational and space efficiency requirements implicated for on...

Stream Data Mining: Platforms, Algorithms, Performance Evaluators and Research Trends

Streaming data are a potentially infinite sequence of data arriving at very high speed, and they may evolve over time. This poses several challenges for mining large-scale, high-speed data streams in real time. Hence, this field has attracted considerable attention from researchers in recent years. This paper discusses various challenges associated with mining such data streams. Several available stream da...

Publication date: 2002